remotes::install_github("nsgrantham/ggbraid")1 The task
In this take-home exercise, you are required to uncover the impact of COVID-19 as well as the global economic and political dynamic in 2022 on Singapore bi-lateral trade (i.e. Import, Export and Trade Balance) by using appropriate analytical visualisation techniques learned in Lesson 6: It’s About Time. Students are encouraged to apply appropriate interactive techniques to enhance user and data discovery experiences.
The write-up of the take-home exercise should include but not limited to the followings:
- Describe the selection and designed consideration of the analytical data visualisation used. The discussion should limit to not more than 150 words each.
- A reproducible description of the procedures used to prepare the analytical visualisation. Please refer to the peer submission I shared.
- A write-up of not more than 100 words to discuss the patterns reveal by each analytical visualization prepared.
Packages
- tidyverse:
- readxl: This package makes it easy to get data out of Excel and into R. It has no external dependencies, so it’s easy to install and use on all operating systems. It is designed to work with tabular data. The easiest way to install the latest released version from CRAN is to install the whole tidyverse. Note that you will still need to load readxl explicitly, because it is not a core tidyverse package loaded via
library(tidyverse) - tidyr: Tidy data describes a standard way of storing data that is used wherever possible throughout the tidyverse. If you ensure that your data is tidy, you’ll spend less time fighting with the tools and more time working on your analysis.
- lubridate: Functions to work with date-times and time-spans: fast and user friendly parsing of date-time data, extraction and updating of components of a date-time (years, months, days, hours, minutes, and seconds), algebraic manipulation on date-time and time-span objects. The easiest way to get lubridate is to install the whole tidyverse.
- readxl: This package makes it easy to get data out of Excel and into R. It has no external dependencies, so it’s easy to install and use on all operating systems. It is designed to work with tabular data. The easiest way to install the latest released version from CRAN is to install the whole tidyverse. Note that you will still need to load readxl explicitly, because it is not a core tidyverse package loaded via
- plotly: is an R package for creating interactive web-based graphs via the open source JavaScript graphing library
plotly.js. - d3scatter: is an HTML R widget for interactive scatter plots visualization. It is based on the htmlwidgets R package and on the d3.js javascript library.
- ggbraid: ggbraid provides a new stat,
stat_braid(), that extends the functionality ofgeom_ribbon()to correctly fill the area between two alternating lines (or steps) with two different colors. ggbraid also provides a geom,geom_braid(), that wrapsgeom_ribbon()and usesstat_braid()by default. You can install the development version of ggbraid from GitHub with:
ggplot2: A system for ‘declaratively’ creating graphics, based on “The Grammar of Graphics”. You provide the data, tell ‘ggplot2’ how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
- TimeSeries: This R package offers novel time series visualisations. It is based on
ggplot2and offersgeoms and pre-packaged functions for easily creating any of the offered charts. - hrbrthemes: A compilation of extra ‘ggplot2’ themes, scales and utilities, including a spell check function plot label fields and an overall emphasis on typography.
- TimeSeries: This R package offers novel time series visualisations. It is based on
tmap: With the tmap package, thematic maps can be generated with great flexibility. The syntax for creating plots is similar to that of
ggplot2, but tailored to maps.CGPfunctions: Miscellaneous functions useful for teaching statistics as well as actually practicing the art. They typically are not new methods but rather wrappers around either base R or other packages. This package is used to create slope graph in this project.
Use the code chunk below to call the packages required for this project.
packages = c('readxl', 'datawizard', 'crosstalk', 'tidyr', 'lubridate','tidyverse', 'plotly', 'd3scatter','tidyquant', 'ggbraid', 'ggTimeSeries', 'CGPfunctions', 'tmap', 'ggplot2', 'hrbrthemes')
for(p in packages){
if(!require(p,character.only = T)){
install.packages(p)
}
library(p,character.only = T)
}2 Data Preparation
Merchandise Trade provided by Department of Statistics, Singapore (DOS) is used. The study period is between January 2020 to December 2022.
Step 1. Checking the number of sheets it contains
excel_sheets("data/data.xlsx")[1] "Content" "T1" "T2"
Step 2. Importing data
In the code chunk below, read_xlsx() of readxl package is used to import the data worksheet of our data workbook into R.
T1 <- read_xlsx("data/data.xlsx", sheet = "T1")
T2 <- read_xlsx("data/data.xlsx", sheet = "T2")Step 3. Transpose the fat table to long table
Gather takes multiple columns and collapses into key-value pairs, duplicating all other columns as needed. You use gather() when you notice that you have columns that are not variables. This function is under tidyr package.
T1 <- gather(T1, "MonthYear", "ImportValue", -`Data Series`)
T2 <- gather(T2, "MonthYear", "ExportValue", -`Data Series`)| Data Series | MonthYear | ImportValue |
|---|---|---|
| Total Merchandise Imports (Thousand Dollars) | 2022 Dec | 49869770 |
| America (Million Dollars) | 2022 Dec | 6901.5 |
| Asia (Million Dollars) | 2022 Dec | 33611.7 |
| Europe (Million Dollars) | 2022 Dec | 7541.8 |
| Oceania (Million Dollars) | 2022 Dec | 1399.9 |
| Data Series | MonthYear | ExportValue |
|---|---|---|
| Total Merchandise Exports (Thousand Dollars) | 2022 Dec | 55000084 |
| America (Million Dollars) | 2022 Dec | 6217.5 |
| Asia (Million Dollars) | 2022 Dec | 39734.8 |
| Europe (Million Dollars) | 2022 Dec | 4924.4 |
| Oceania (Million Dollars) | 2022 Dec | 3034.8 |
Step 4. Convert MonthYear column to date format
T1$`MonthYear` <- ym(T1$`MonthYear`)
T2$`MonthYear` <- ym(T2$`MonthYear`)
# Convert ImportValue column to numeric format
T1$`ImportValue` <- as.numeric(T1$`ImportValue`)
T2$`ExportValue` <- as.numeric(T2$`ExportValue`)Step 5. Separate region and country
# =================== Import =================== #
Region <- T1 %>%
filter(grepl('Million', `Data Series`)) %>%
rename("Region" = "ImportValue")
Country <- T1 %>%
filter(grepl('Thousand', `Data Series`)) %>%
rename("Country" = "ImportValue")
Import <- full_join(Region, Country, by = join_by(`Data Series`, `MonthYear`))
Import <- gather(Import , "Level", "ImportValue", -`Data Series`, -`MonthYear`)
# =================== Export =================== #
Region <- T2 %>%
filter(grepl('Million', `Data Series`)) %>%
rename("Region" = "ExportValue")
Country <- T2 %>%
filter(grepl('Thousand', `Data Series`)) %>%
rename("Country" = "ExportValue")
Export <- full_join(Region, Country, by = join_by(`Data Series`, `MonthYear`))
Export <- gather(Export , "Level", "ExportValue", -`Data Series`, -`MonthYear`)Step 6. Filter year from 2016 and rename column
Import <- Import %>%
filter(`MonthYear`> as.Date("2015-12-01")) %>%
rename(`Country` = `Data Series`)
Export <- Export %>%
filter(`MonthYear`> as.Date("2015-12-01")) %>%
rename(`Country` = `Data Series`)Step 7. Merge Import and Export into one table
wide <- full_join(Import, Export, by = join_by(`Country`, `MonthYear`,`Level`))
wide <- wide %>%
mutate("Diff" = ImportValue-ExportValue) %>%
mutate("Total" = ImportValue+ExportValue)
wide$`Country` <- str_replace(wide$`Country`, "Mainland China", "China")
wide$`Country` <- str_replace_all(wide$`Country`, " \\(|Thousand Dollars|\\)", "")
wide$`Country` <- str_replace_all(wide$`Country`, " \\(|Million Dollars|\\)", "")
long <- gather(wide , "Type", "Value", -`Country`, -`MonthYear`,-`Level`)And here we finally achieve both wide table and long table ready for our analysis.
Table Wide : Merchandise Imports/Export By Region/Market, Monthly
| Country | MonthYear | Level | ImportValue | ExportValue | Diff | Total |
|---|---|---|---|---|---|---|
| America | 2022-12-01 | Region | 6901.5 | 6217.5 | 684.0 | 13119.0 |
| Asia | 2022-12-01 | Region | 33611.7 | 39734.8 | -6123.1 | 73346.5 |
| Europe | 2022-12-01 | Region | 7541.8 | 4924.4 | 2617.4 | 12466.2 |
| Oceania | 2022-12-01 | Region | 1399.9 | 3034.8 | -1634.9 | 4434.7 |
| Africa | 2022-12-01 | Region | 414.9 | 1088.6 | -673.7 | 1503.5 |
Table Long : Merchandise Imports/Export By Region/Market, Monthly
| Country | MonthYear | Level | Type | Value |
|---|---|---|---|---|
| America | 2022-12-01 | Region | ImportValue | 6901.5 |
| Asia | 2022-12-01 | Region | ImportValue | 33611.7 |
| Europe | 2022-12-01 | Region | ImportValue | 7541.8 |
| Oceania | 2022-12-01 | Region | ImportValue | 1399.9 |
| Africa | 2022-12-01 | Region | ImportValue | 414.9 |
3 Visualizations
3.1 Scatter plot Dashboard
Step 1. Create scatter plot with plotly
Code
# Plot scatter plot
fig <- wide %>%
plot_ly(
x = ~`ImportValue`,
y = ~`ExportValue`,
color = ~`Country`,
frame = ~as.character(`MonthYear`, format = "%Y-%m"),
size = ~`Total`,
sizes = c(10,1000),
text= ~paste("Country:",`Country`,
"\nImport Value:", `ImportValue`, " Thousand Dollars",
"\nExport Value:", `ExportValue`, " Thousand Dollars",
"\nMonth Year:", `MonthYear`),
hoverinfo = "text",
type = 'scatter',
mode = 'markers'
)Step 2. Set up layout
Code
# Create the diagonal line
dline <- function(color = "steelblue") {
list(
type = "line",
yref = "paper",
xref = "paper",
y0 = 0, y1 = 1,
x0 = 0, x1 = 1,
line = list(color = color, dash="dot")
)
}
# Setup layout
fig <- fig %>%
layout(title = list(text="Singapore bi-lateral trade volume"),
subtitle = "2016-2022",
hoverlabel = list(align = "left"),
shapes = dline(),
legend = list(orientation = "h", y = 1, x = 0),
showlegend = FALSE,
xaxis = list(title="Import Value", range = list(0, 10000000)),
yaxis = list(title="Export Value", range = list(0, 10000000)),
width=650,
height=650
)Step 3. Set up animation
Code
fig <- fig %>%
animation_opts(
500, easing = "linear", redraw = FALSE
)
# Animation slider
fig <- fig %>% animation_slider(
currentvalue = list(prefix = "MONTH-YEAR :", font = list(color="red"))
)
figInsights from scatter plot
- Mainland China is range with a High Import - High Export spectrum. From 2020 to 2021, there was a significant increase in value, particularly export value.
- Taiwan import rate has continuously grown but we can observed significant jump of import value in 2021 after COVID.
3.2 Slope Graph
Step 1. Prepare data for slope graph
Code
# Import
slopeimport <- wide %>%
subset(Level == "Region") %>%
mutate(month = month(MonthYear)) %>%
mutate(year = year(MonthYear)) %>%
drop_na() %>%
group_by(Country, year) %>%
summarise(sumyear = sum(ImportValue)) %>%
mutate(Year = factor(year)) %>%
arrange(`sumyear`)
#Export
slopeexport <- wide %>%
subset(Level == "Region") %>%
mutate(month = month(MonthYear)) %>%
mutate(year = year(MonthYear)) %>%
drop_na() %>%
group_by(Country, year) %>%
summarise(sumyear = sum(ExportValue)) %>%
mutate(Year = factor(year)) %>%
arrange(`sumyear`)Step 2. Create slope plot
Code
# Create slope plot for import
p <- newggslopegraph(dataframe = slopeimport,
Times = `Year`,
Measurement = `sumyear`,
Grouping = `Country`,
Title = "Total Import per Year by Region",
SubTitle = "2016-2022",
Caption = NULL)
p + annotate("rect",
xmin = "2020",
xmax = "2022",
ymin = -1,
ymax = 600000,
alpha = .1,
fill = "yellow")
# Create slope plot for export
p2 <- newggslopegraph(dataframe = slopeexport,
Times = `Year`,
Measurement = `sumyear`,
Grouping = `Country`,
Title = "Total Export per Year by Region",
SubTitle = "2016-2022",
Caption = NULL)
p2 + annotate("rect",
xmin = "2020",
xmax = "2022",
ymin = -1,
ymax = 600000,
alpha = .1,
fill = "yellow")

Insights from slope graph
- Mainland China is range with a High Import - High Export spectrum. From 2020 to 2021, there was a significant increase in value, particularly export value.
- Taiwan import rate has continuously grown but we can observed significant jump of import value in 2021 after COVID.
3.3 Line Plot
Step 1. Prepare data for line plot of Singapore
Code
singapore <- wide %>%
drop_na() %>%
group_by(MonthYear) %>%
summarise(import = sum(ImportValue), export = sum(ExportValue))
singaporeribbon <- gather(singapore , "Type", "Value", -`MonthYear`)Step 2. Prepare data for line plot of Singapore
Code
ggplot() +
geom_line(aes(`MonthYear`, `Value`, linetype = `Type`),
data = singaporeribbon,
show.legend = FALSE) +
geom_braid(aes(`MonthYear`,
ymin = `import`,
ymax = `export`,
fill = `import`>`export`),
data = singapore,
alpha = 0.6,
method = 'line')+
annotate("rect",
xmin = as.Date("2020-01-01"),
xmax = as.Date("2022-12-01"),
ymin = 0,
ymax = 65000000,
alpha = .1,
fill = "yellow")
Step 3. Prepare data for line plot of China
Code
braid <- wide %>%
select(`Country`, `MonthYear`, `Level`, `ImportValue`, `ExportValue`) %>%
drop_na() %>%
subset(Country == "China")
ribbon <- gather(braid , "Type", "Value", -`Country`, -`MonthYear`, -`Level`)Step 2. Plot line plot of China
Code
ggplot() +
geom_line(aes(`MonthYear`, `Value`, linetype = `Type`),
data = ribbon,
show.legend = FALSE) +
geom_braid(aes(`MonthYear`,
ymin = `ImportValue`,
ymax = `ExportValue`,
fill = `ImportValue`>`ExportValue`),
data = braid,
alpha = 0.6,
method = 'line')+
annotate("rect",
xmin = as.Date("2020-01-01"),
xmax = as.Date("2022-12-01"),
ymin = 0,
ymax = 10000000,
alpha = .1,
fill = "yellow")
Insights from line plot
- Mainland China is range with a High Import - High Export spectrum. From 2020 to 2021, there was a significant increase in value, particularly export value.
- Taiwan import rate has continuously grown but we can observed significant jump of import value in 2021 after COVID.
3.4 Heat Map
Step 1. Prepare data by selecting top difference between import and export
Code
heatmap <- wide %>%
drop_na() %>%
group_by(Country) %>%
mutate(totaldiff = sum(Diff)) %>%
arrange(totaldiff) %>%
subset(-55000000>totaldiff|totaldiff>55000000)Step 2. Plot heatmap
Code
p <- heatmap %>%
ggplot(aes(x = MonthYear, y = reorder(Country,totaldiff), fill= Diff)) +
scale_fill_distiller(palette = "RdPu") +
theme_ipsum() +
geom_tile() +
theme(axis.text.x = element_text(angle = 90, vjust = 1.5, hjust=1.5))+
theme(axis.text.y = element_text(size = 10, vjust = 1.5, hjust=1.5))+
labs(title = "Trade balance", x="", y="")
ggplotly(p, tooltip= "text")Insights from heatmap
- Mainland China is range with a High Import - High Export spectrum. From 2020 to 2021, there was a significant increase in value, particularly export value.
- Taiwan import rate has continuously grown but we can observed significant jump of import value in 2021 after COVID.
3.5 Cycle Plot
Step 1: Deriving month and year fields
Code
cycle <- wide
cycle$month <- month(cycle$`MonthYear`)
cycle$year <- year(cycle$`MonthYear`)Step 2: Extracting the target country
Code
cycle <- cycle %>%
subset(`Country`== "China") %>%
drop_na()Step 3: Computing year average import by month
Code
x <- cycle %>%
select(Country, month, year, Total) %>%
group_by(month) %>%
summarise(avg = mean(Total))
hline.data <- cycle %>%
group_by(`month`) %>%
mutate(avgvalue = mean(`Total`))Step 4: Plotting the cycle plot
Code
ggplot() +
geom_line(data=cycle,
aes(x=year,
y=Diff,
group=month),
colour="black") +
geom_hline(aes(yintercept=avgvalue),
data=hline.data,
linetype=6,
colour="red",
linewidth=0.5) +
facet_grid(~month) +
labs(axis.text.x = element_blank(),
title = "xxxxxxxxxxxxxxxxxxxxxxxxx") +
xlab("") +
ylab("Import Value") +
theme(plot.title = element_text(size=22),
axis.text.x = element_text(size = 10, angle = 90),
axis.text.y = element_text(size = 10),
strip.text = element_text(size = 10))
Pay Attention
Using callouts is an effective way to highlight content that your reader give special consideration or attention.
3.6 Choropleth Map
Step 1. Prepare data for Choropleth map
Code
data("World")
map <- World %>%
select(iso_a3, name, sovereignt, geometry)
map$name <- as.character(map$name)
map$sovereignt <- as.character(map$sovereignt)Step 2. Create animated map
Code
data_map_area <- map %>%
full_join(wide, by = c('sovereignt' = 'Country')) %>%
drop_na()
tmap_mode("view")
choropleth <- tm_shape(data_map_area) +
tm_polygons("Total") +
tm_facets(along = "MonthYear", free.coords = FALSE)
tmap_animation(choropleth , filename = "choropleth.gif", delay = 25)
4 Interactive Dashboard
Code
# Prepare data for dashboard
line <- long %>%
subset(Type == "ImportValue"|Type == "ExportValue")
# Building interactive filters
d <- highlight_key(line)
filter_tools <- htmltools::div(
filter_select(id = "filter",
label = "Select Country",
sharedData = d,
group = ~Country,
multiple=FALSE),
filter_slider(id = "period",
label = "Select period",
sharedData = d,
column = ~year(MonthYear),
width = "100%"),
filter_slider(id = "value",
label = "Select Value",
sharedData = d,
column = ~Value,
width = "100%"),
filter_checkbox(id = "variable",
label = "Select variable",
sharedData = d,
group = ~Type,
inline = FALSE))
vline <- function(x = 0, color = "steelblue") {
list(
type = "line",
y0 = 0, y1 = 1,
yref = "paper",
x0 = x, x1 = x,
line = list(color = color, dash="dot")
)
}
# plotting interactive scatter plot using plotly
p <- plot_ly(data=d,
type= "scatter",
mode= "line",
x= ~MonthYear,
y= ~Value,
color= ~Type,
colors= "Accent",
# fill = 'tonexty',
text= ~paste("Country:",`Country`,
"\nMonth Year:", `MonthYear`,
"\nType:",`Type`)) %>%
layout(title = list(text="<b>Import/Export trend by country</b>"),
hoverlabel = list(align = "left"),
legend = list(orientation = "h", y = 1, x = 0),
shapes = vline("2020"),
xaxis = list(title="Month Year"),
yaxis = list(title="Value"))
gg <- highlight(p, "plotly_selected")
# Using crosstalk bscols to put all 3 elements (filter, scatter plot, datatable) together.
crosstalk::bscols(filter_tools,gg,DT::datatable(d, class= "display",
filter=list(position="top", clear=FALSE),
options=list(pageLength = 10,scrollY = TRUE,
iDisplayLength = 25)),
widths = c(4, 8, 12),
annotations = list(caption = "Data from Department of Statistics, Singapore (DOS)"))Code
function filter_default() {
document.getElementById("filter").getElementsByClassName("selectized")
[0].selectize.setValue("China", false);
}
window.onload = filter_default;Code
# Building interactive filters
# d <- highlight_key(ribbon)
# # d2 <- highlight_key(braid)
#
# filter_tools <- htmltools::div(
# filter_select(id = "country",
# label = "Select Country",
# sharedData = d,
# group = ~Country,
# multiple=FALSE),
#
# filter_slider(id = "period",
# label = "Select period",
# sharedData = d,
# column = ~year(MonthYear),
# width = "100%"))
#
# # plotting interactive scatter plot using plotly
# p <- ggplot() +
# geom_line(aes(`MonthYear`, `Value`, linetype = `Type`), data = ribbon)
# # +
# # geom_braid(aes(`MonthYear`,
# # ymin = `ImportValue`,
# # ymax = `ExportValue`,
# # fill = `ImportValue`>`ExportValue`),
# # data = braid, alpha = 0.6) +
# # guides(linetype = "none", fill = "none")
#
#
# gg <- highlight(p, "plotly_selected")
#
# # Using crosstalk bscols to put all 3 elements (filter, scatter plot, datatable) together.
# crosstalk::bscols(filter_tools, gg, widths = c(4, 8))